Search for: All records

Editors contains: "Theunissen, Frédéric E"

Note: Clicking a Digital Object Identifier (DOI) number will take you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.

  1. Theunissen, Frédéric E. (Ed.)
    Human speech recognition transforms a continuous acoustic signal into categorical linguistic units by aggregating information that is distributed in time. It has been suggested that this kind of information processing may be understood through the computations of a Recurrent Neural Network (RNN) that receives input frame by frame, linearly in time, but builds an incremental representation of this input through a continually evolving internal state. While RNNs can simulate several key behavioral observations about human speech and language processing, it is unknown whether RNNs also develop computational dynamics that resemble human neural speech processing. Here we show that the internal dynamics of long short-term memory (LSTM) RNNs, trained to recognize speech from auditory spectrograms, predict human neural population responses to the same stimuli, beyond predictions from auditory features. Variations in the RNN architecture motivated by cognitive principles further improved this predictive power. Specifically, modifications that allow more human-like phonetic competition also led to more human-like temporal dynamics. Overall, our results suggest that RNNs provide plausible computational models of the cortical processes supporting human speech recognition.
    Free, publicly-accessible full text available July 28, 2026
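
    Entry 1 above describes training an LSTM on auditory spectrograms and then relating its internal states to neural population responses. The following is a minimal, hypothetical Python/PyTorch sketch of that general pipeline; the class name, layer sizes, tensor shapes, and random inputs are illustrative assumptions, not the authors' implementation.

    ```python
    # Illustrative sketch only; not the authors' code. Tensor shapes, layer
    # sizes, and the random inputs are assumptions for demonstration.
    import torch
    import torch.nn as nn

    class SpeechLSTM(nn.Module):
        """LSTM mapping auditory spectrogram frames to per-frame phoneme scores."""
        def __init__(self, n_freq=128, n_hidden=256, n_phonemes=40):
            super().__init__()
            self.lstm = nn.LSTM(n_freq, n_hidden, batch_first=True)
            self.readout = nn.Linear(n_hidden, n_phonemes)

        def forward(self, spec):
            # spec: (batch, time, n_freq); hidden: (batch, time, n_hidden)
            hidden, _ = self.lstm(spec)
            return self.readout(hidden), hidden

    model = SpeechLSTM()
    spec = torch.randn(8, 200, 128)    # stand-in spectrogram batch
    logits, hidden = model(spec)       # logits for training; hidden states for analysis

    # After training on a speech-recognition objective, the per-frame hidden
    # states could be regressed against neural population responses (e.g. with
    # ridge regression) and compared with a baseline using auditory features alone.
    ```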
  2. Theunissen, Frédéric E. (Ed.)
    System identification techniques such as projection pursuit regression models (PPRs) and convolutional neural networks (CNNs) provide state-of-the-art performance in predicting visual cortical neurons' responses to arbitrary input stimuli. However, the constituent kernels recovered by these methods are often noisy and lack coherent structure, making it difficult to understand the underlying component features of a neuron's receptive field. In this paper, we show that using a dictionary of diverse kernels with complex shapes, learned from natural scenes on the basis of efficient coding theory, as the front end for PPRs and CNNs can improve their performance in neuronal response prediction as well as their data efficiency and convergence speed. Extensive experimental results also indicate that these sparse-code kernels provide important information about the component features of a neuron's receptive field. In addition, we find that models with the complex-shaped sparse-code front end are significantly better than models with a standard orientation-selective Gabor filter front end at modeling V1 neurons that have been found to exhibit complex pattern selectivity. We show that the relative performance difference due to these two front ends can be used to produce a sensitive metric for detecting complex selectivity in V1 neurons.
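
    As a rough illustration of the idea in entry 2 (a fixed dictionary of learned kernels serving as the front end of a response-prediction network), here is a hypothetical PyTorch sketch. The random placeholder kernels, layer sizes, and readout head are assumptions standing in for the sparse-code dictionary and the models described in the abstract.

    ```python
    # Illustrative sketch only. Random kernels stand in for a dictionary of
    # sparse-code kernels learned from natural scenes; all shapes are assumptions.
    import torch
    import torch.nn as nn

    n_kernels, ksize = 64, 16
    dictionary = torch.randn(n_kernels, 1, ksize, ksize)    # placeholder kernel bank

    front_end = nn.Conv2d(1, n_kernels, kernel_size=ksize, bias=False)
    front_end.weight.data = dictionary                      # load the dictionary
    front_end.weight.requires_grad = False                  # keep the front end fixed

    # Small trainable readout mapping front-end activations to a predicted response.
    readout = nn.Sequential(
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(4),
        nn.Flatten(),
        nn.Linear(n_kernels * 4 * 4, 1),
    )

    stimulus = torch.randn(1, 1, 64, 64)                    # fake image patch
    predicted_rate = readout(front_end(stimulus))
    ```

    Only the readout would be fit to neural data in this setup; keeping the dictionary fixed is what lets its kernels be compared directly against a Gabor front end.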
  3. Theunissen, Frédéric E. (Ed.)
    Recent neuroscience studies demonstrate that a deeper understanding of brain function requires a deeper understanding of behavior. Detailed behavioral measurements are now often collected using video cameras, resulting in an increased need for computer vision algorithms that extract useful information from video data. Here we introduce a new video analysis tool that combines the output of supervised pose estimation algorithms (e.g., DeepLabCut) with unsupervised dimensionality reduction methods to produce interpretable, low-dimensional representations of behavioral videos that extract more information than pose estimates alone. We demonstrate this tool by extracting interpretable behavioral features from videos of three different head-fixed mouse preparations, as well as a freely moving mouse in an open-field arena, and show how these features can facilitate downstream behavioral and neural analyses. We also show how the behavioral features produced by our model improve the precision and interpretation of these downstream analyses compared to using the outputs of either fully supervised or fully unsupervised methods alone.
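
    Entry 3 combines supervised pose estimates with unsupervised dimensionality reduction of the raw video. The sketch below is only a simplified stand-in for the paper's model: it pairs hypothetical DeepLabCut-style keypoints with PCA latents of the frames, and the array shapes and the choice of PCA are assumptions.

    ```python
    # Illustrative sketch only. Random arrays stand in for tracked keypoints
    # (e.g. from DeepLabCut) and for video frames; shapes and the use of PCA
    # are assumptions, not the paper's actual model.
    import numpy as np
    from sklearn.decomposition import PCA

    n_frames = 1000
    frames = np.random.rand(n_frames, 64 * 64)      # flattened grayscale frames
    keypoints = np.random.rand(n_frames, 2 * 8)     # x, y for 8 tracked body parts

    # Unsupervised low-dimensional latents extracted from the video itself.
    video_latents = PCA(n_components=10).fit_transform(frames)

    # Joint behavioral representation: pose estimates plus video latents that
    # can capture variation the keypoints alone may miss.
    behavior = np.hstack([keypoints, video_latents])
    print(behavior.shape)                           # (1000, 26)
    ```

    The combined feature matrix could then be related to neural activity or used for downstream behavioral segmentation, which is the kind of analysis the abstract describes.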